Simpute: An Efficient Solution for Dense Genotypic Data
Abstract
Single nucleotide polymorphism (SNP) data derived from array-based technology or massively parallel sequencing often contain missing genotypes. Missing SNPs can bias the results of association analyses. To maximize information usage, imputation is often adopted to compensate for the missing data by filling in the most probable values. To better understand the available tools for this purpose, we compare the imputation performance of BEAGLE, IMPUTE, BIMBAM, SNPMStat, MACH, and PLINK on data generated by randomly masking genotype data from the International HapMap Phase III project. In addition, we propose a new algorithm called simple imputation (Simpute) that benefits from the high resolution of the SNPs in the array platform. Simpute does not require any reference data. The best feature of Simpute is its computational efficiency, with complexity of order O(mw + n), where n is the number of missing SNPs, w is the number of positions at which SNPs are missing, and m is the number of people considered. Simpute is suitable for regular screening of large-scale SNP genotyping data, particularly when the sample size is large and efficiency is a major concern in the analysis.
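The masking-based comparison described in the abstract can be illustrated with a minimal sketch. It hides a random fraction of known genotype calls, fills them back in, and scores the fraction recovered correctly. Everything here is an assumption made for illustration: the 0/1/2 genotype coding, the function names, and in particular the per-SNP most-common-genotype fill, which is a naive stand-in baseline and is not Simpute or any of the tools compared in the paper.

```python
import random
from collections import Counter

MISSING = None  # hypothetical marker for a masked genotype call


def mask_genotypes(genotypes, rate, seed=0):
    """Copy an m x p genotype matrix and hide each call with probability `rate`.

    Returns the masked copy and the list of (person, snp) positions hidden.
    """
    rng = random.Random(seed)
    masked = [row[:] for row in genotypes]
    hidden = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < rate:
                hidden.append((i, j))
                row[j] = MISSING
    return masked, hidden


def impute_column_mode(masked):
    """Stand-in baseline (NOT Simpute): fill each missing call with the most
    common observed genotype at the same SNP position."""
    imputed = [row[:] for row in masked]
    n_snps = len(masked[0]) if masked else 0
    for j in range(n_snps):
        observed = [row[j] for row in masked if row[j] is not MISSING]
        if not observed:
            continue  # no observed calls at this SNP; leave it missing
        mode = Counter(observed).most_common(1)[0][0]
        for row in imputed:
            if row[j] is MISSING:
                row[j] = mode
    return imputed


def imputation_accuracy(truth, imputed, hidden):
    """Fraction of masked positions whose imputed call equals the true call."""
    if not hidden:
        return float("nan")
    correct = sum(truth[i][j] == imputed[i][j] for i, j in hidden)
    return correct / len(hidden)


if __name__ == "__main__":
    # Toy data: 4 people x 5 SNPs, genotypes coded as minor-allele counts.
    truth = [
        [0, 1, 2, 0, 1],
        [0, 1, 2, 0, 0],
        [0, 2, 2, 1, 1],
        [0, 1, 2, 0, 1],
    ]
    masked, hidden = mask_genotypes(truth, rate=0.2, seed=42)
    imputed = impute_column_mode(masked)
    print("masked positions:", hidden)
    print("accuracy:", imputation_accuracy(truth, imputed, hidden))
```

Any imputation method can be swapped in for `impute_column_mode` as long as it takes the masked matrix and returns a filled one, so several methods can be scored against the same set of masked positions.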
Journal title:
Volume 2013, Issue
Pages -
Publication date: 2013